Text Data Mining with Optimized Pattern Discovery
Abstract
This paper describes an application of the optimized pattern discovery framework to text and Web mining. In particular, we introduce a class of simple combinatorial patterns over phrases, called proximity phrase association patterns, and consider the problem of finding the patterns that optimize a given statistical measure in a large collection of unstructured texts. For this class of patterns, we develop fast and robust text mining algorithms based on techniques from computational geometry and string matching. We then ran experiments on large collections of documents and on Web pages to evaluate the proposed method.
Similar resources
Text Data Mining: Discovery of Important Keywords in the Cyberspace
This paper describes applications of the optimized pattern discovery framework to text and Web mining. In particular, we introduce a class of simple combinatorial patterns over phrases, called proximity phrase association patterns, and consider the problem of finding the patterns that optimize a given statistical measure within the whole class of patterns in a large collection of unstructured t...
Efficient Text and Semi-structured Data Mining: Knowledge Discovery in the Cyberspace
This paper describes applications of the optimized pattern discovery framework to text and Web mining. In particular, we introduce a class of simple combinatorial patterns over texts such as proximity phrase association patterns and ordered and unordered tree patterns modeling unstructured texts and semi-structured data on the Web. Then, we consider the problem of finding the patterns that opti...
HARIALGM: Knowledge Discovery and Data Mining in Pedagogy with DNA Finger Printing
Knowledge Discovery and Data Mining (KDD) is a multidisciplinary area focusing upon methodologies for extracting useful knowledge from data and there are several useful KDD tools to extract the knowledge. The ongoing rapid growth of online data due to the Internet and the widespread use of databases have created an immense need for KDD methodologies. The challenge of extracting knowledge from d...
A Survey on Web Log Mining Pattern Discovery
The web is a great source of information and knowledge, where numerous users find content of interest. The available data takes the form of structured (relational) data and text data. Therefore, different kinds of data models can be implemented with web data for pattern discovery. Web mining is a data mining tool in which web-related data is evaluated for pattern discovery and user navigation patterns. Add...
Web Usage Mining Tools & Techniques: A Survey
The quest for knowledge has led to new discoveries and inventions, which in turn lead to the amelioration of various technologies. As years passed, the World Wide Web became overloaded with information, and it became hard to retrieve data according to need. Web mining emerged to provide a solution to this problem. Web usage mining is a category of web mining. Web usage mining mainly circulation with...
Publication date: 2000